Text File | 1993-08-07 | 15KB | 397 lines
FILE(1) FILE(1)
NAME
file - determine file type
SYNOPSIS
file [ -c ] [ -z ] [ -L ] [ -f namefile ] [ -m magicfile ]
file ...
DESCRIPTION
File tests each argument in an attempt to classify it.
There are three sets of tests, performed in this order:
filesystem tests, magic number tests, and language tests.
The first test that succeeds causes the file type to be
printed.
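The ordering described above can be sketched as a simple dispatch loop.
This is only an illustration, not the program's actual code: the names
classify, filesystem_tests, magic_tests, language_tests and ORDER are
hypothetical, and the three stand-in tests return fixed answers.

```python
from typing import Callable, Optional, Sequence

# A test inspects a file (given by path) and returns a description,
# or None when it cannot classify the file.
Test = Callable[[str], Optional[str]]


def classify(path: str, tests: Sequence[Test]) -> Optional[str]:
    """Run the tests in the given order; the first one that succeeds wins."""
    for test in tests:
        description = test(path)
        if description is not None:
            return description  # later tests are never consulted
    return None


# Hypothetical stand-ins for the three groups, in the order given above.
def filesystem_tests(path: str) -> Optional[str]:
    return None  # pretend the file is an ordinary, non-empty file


def magic_tests(path: str) -> Optional[str]:
    return "executable"  # pretend a magic number matched


def language_tests(path: str) -> Optional[str]:
    return "ascii text"  # reached only if the earlier tests fail


ORDER = [filesystem_tests, magic_tests, language_tests]
```

Calling classify("a.out", ORDER) stops at the magic number stand-in, so
the language stand-in is never consulted.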
The type printed will usually contain one of the words
text (the file contains only ASCII characters and is probably
safe to read on an ASCII terminal), executable (the
file contains the result of compiling a program in a form
understandable to some UNIX kernel or another), or data
meaning anything else (data is usually `binary' or non-printable).
Exceptions are well-known file formats (core
files, tar archives) that are known to contain binary
data. When modifying the file /etc/magic or the program
itself, preserve these keywords. People depend on knowing
that all the readable files in a directory have the
word ``text'' printed. Don't do as Berkeley did - change
``shell commands text'' to ``shell script''.
The filesystem tests are based on examining the return
from a stat(2) system call. The program checks to see if
the file is empty, or if it's some sort of special file.
Any known file types appropriate to the system you are
running on (sockets, symbolic links, or named pipes
(FIFOs) on those systems that implement them) are intuited
if they are defined in the system header file sys/stat.h.
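A minimal sketch of such a filesystem test, using Python's os.lstat and
the stat module in place of the C stat(2) interface. The function name
filesystem_test and the strings it returns are assumptions made for
illustration; they are not the wording this program prints.

```python
import os
import stat
from typing import Optional


def filesystem_test(path: str) -> Optional[str]:
    """Classify a file from its stat information alone, or return None."""
    st = os.lstat(path)  # lstat reports a symlink itself, not its target
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    return None  # ordinary non-empty file: leave it to the later tests
```

Returning None for an ordinary file is what lets the magic number and
language tests run afterwards.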
The magic number tests are used to check for files with
data in particular fixed formats. The canonical example
of this is a binary executable (compiled program) a.out
file, whose format is defined in a.out.h and possibly
exec.h in the standard include directory. These files
have a `magic number' stored in a particular place near
the beginning of the file that tells the UNIX operating
system that the file is a binary executable, and which of
several types thereof. The concept of `magic number' has
been applied by extension to data files. Any file with
some invariant identifier at a small fixed offset into the
file can usually be described in this way. The information
identifying these files is read from the magic file
/etc/magic.
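The idea of an invariant identifier at a fixed offset can be sketched as
a table lookup. The table below is illustrative only and is not copied
from /etc/magic; the names MAGIC_TABLE and magic_test are hypothetical,
and real magic entries also support numeric types and continuation
lines, which this sketch omits.

```python
from typing import List, Optional, Tuple

# (offset, expected bytes, description). Illustrative entries only.
MAGIC_TABLE: List[Tuple[int, bytes, str]] = [
    (0, b"\x1f\x9d", "compressed data"),
    (0, b"%!", "PostScript document text"),
    (257, b"ustar", "tar archive"),
]


def magic_test(data: bytes,
               table: List[Tuple[int, bytes, str]] = MAGIC_TABLE
               ) -> Optional[str]:
    """Return the description of the first entry whose bytes match."""
    for offset, expected, description in table:
        # Slicing past the end of data yields a short slice, so a file
        # that is too small simply fails the comparison.
        if data[offset:offset + len(expected)] == expected:
            return description
    return None
```

Here data stands for the leading bytes of the file being examined.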
If an argument appears to be an ASCII file, file attempts
to guess its language. The language tests look for particular
strings (cf. names.h) that can appear anywhere in
the first few blocks of a file. For example, the keyword
.br indicates that the file is most likely a troff input
file, just as the keyword struct indicates a C program.
These tests are less reliable than the previous two
groups, so they are performed last. The language test
routines also test for some miscellany (such as tar
archives) and determine whether an unknown file should be
labelled as `ascii text' or `data'.
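A rough sketch of a language test along these lines follows. The keyword
table is hypothetical (the real list lives in names.h and is not
reproduced here), the 1024-byte limit stands in for "the first few
blocks", and the names KEYWORDS, looks_ascii and language_test are
assumptions.

```python
from typing import List, Tuple

# Hypothetical keyword table: (keyword, description).
KEYWORDS: List[Tuple[bytes, str]] = [
    (b".br", "troff input text"),
    (b"struct", "c program text"),
]


def looks_ascii(chunk: bytes) -> bool:
    """True if every byte is printable ASCII or common whitespace."""
    return all(32 <= b <= 126 or b in b"\t\n\v\f\r" for b in chunk)


def language_test(data: bytes, limit: int = 1024) -> str:
    """Guess a language from keywords near the start of the data."""
    head = data[:limit]
    if not looks_ascii(head):
        return "data"
    words = set(head.split())  # whitespace-separated tokens
    for keyword, description in KEYWORDS:
        if keyword in words:
            return description
    return "ascii text"  # ASCII, but no known keyword was seen
```

A single keyword is weak evidence, which is consistent with these tests
being run only after the other two groups have failed.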
Use -m file to specify an alternate file of magic numbers.
The -z option tries to look inside compressed files.
The -c option causes a checking printout of the parsed
form of the magic file. This is usually used in conjunction
with -m to debug a new magic file before installing
it.
The -f namefile option specifies that the names of the
files to be examined are to be read (one per line) from
namefile before the argument list. Either namefile or at
least one filename argument must be present; to test the
standard input, use ``-'' as a filename argument.
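The way -f combines with the argument list can be sketched as follows.
The function name files_to_examine is hypothetical, and skipping blank
lines in namefile is an assumption of this sketch, not documented
behaviour.

```python
from typing import Iterable, List, Optional


def files_to_examine(namefile_lines: Optional[Iterable[str]],
                     args: List[str]) -> List[str]:
    """Names from namefile come first, then the command-line arguments."""
    if namefile_lines is None and not args:
        raise ValueError("need a namefile or at least one filename")
    names: List[str] = []
    if namefile_lines is not None:
        for line in namefile_lines:
            name = line.rstrip("\n")
            if name:  # assumption: blank lines are ignored
                names.append(name)
    names.extend(args)  # a lone "-" here would mean the standard input
    return names
```

Passing None for namefile_lines models a command line without -f.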
The -L option causes symlinks to be followed, as the like-named
option in ls(1).
FILES
/etc/magic - default list of magic numbers
SEE ALSO
magic(4) - description of magic file format.
strings(1), od(1) - tools for examining non-textfiles.
STANDARDS CONFORMANCE
This program is believed to exceed the System V Interface
Definition of FILE(CMD), as near as one can determine from
the vague language contained therein. Its behaviour is
mostly compatible with the System V program of the same
name. This version knows more magic, however, so it will
produce different (albeit more accurate) output in many
cases.
The one significant difference between this version and
System V is that this version treats any white space as a
delimiter, so that spaces in pattern strings must be
escaped. For example,
>10 string language impress (imPRESS data)
in an existing magic file would have to be changed to
>10 string language\ impress (imPRESS data)
In addition, in this version, if a pattern string contains
a backslash, it must be escaped. For example
0 string \begindata Andrew Toolkit document
in an existing magic file would have to be changed to
0 string \\begindata Andrew Toolkit document
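The two escaping rules above can be illustrated with a small field
splitter. This is a sketch under simplifying assumptions, not the
program's parser: a backslash just makes the next character literal
(other escape sequences are not modelled), the fourth field is taken to
be the rest of the line, and the name split_magic_line is hypothetical.

```python
from typing import List


def split_magic_line(line: str) -> List[str]:
    """Split a magic entry into offset, type, pattern and message.

    Any unescaped white space ends a field; a backslash makes the
    following character (a space or another backslash) literal.
    """
    fields: List[str] = []
    current: List[str] = []
    i, n = 0, len(line)
    while i < n and len(fields) < 3:
        ch = line[i]
        if ch == "\\" and i + 1 < n:
            current.append(line[i + 1])  # keep the escaped character
            i += 2
        elif ch.isspace():
            if current:  # end of a field
                fields.append("".join(current))
                current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:
        fields.append("".join(current))
    message = line[i:].strip()  # the remainder is the printed message
    if message:
        fields.append(message)
    return fields
```

With the unescaped entry, the pattern comes out as just "language" and
"impress" leaks into the message; with the escaped entry, the pattern is
"language impress" as intended.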
SunOS releases 3.2 and later from Sun Microsystems include
a file(1) command derived from the System V one, but with
some extensions. My version differs from Sun's only in
minor ways. It includes the extension of the `&' operator,
used as, for example,
>16 long&0x7fffffff >0 not stripped
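The effect of the entry above can be sketched as: read a long at offset
16, AND it with the mask, then compare the result. The function name
masked_long_greater is hypothetical, and the 4-byte size and big-endian
default are assumptions of this sketch; the byte order the program
actually uses is not stated here.

```python
import struct


def masked_long_greater(data: bytes, offset: int, mask: int,
                        threshold: int, byteorder: str = ">") -> bool:
    """True if (long at offset) & mask is greater than threshold."""
    if len(data) < offset + 4:
        return False  # file too short to hold the value
    # "l" with an explicit byte-order prefix is a signed 4-byte integer.
    (value,) = struct.unpack_from(byteorder + "l", data, offset)
    return (value & mask) > threshold
```

The mask matters when the top bit is set: a stored 0x80000001 is
negative as a signed long, but after masking with 0x7fffffff it becomes
1 and passes the ">0" test.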
MAGIC DIRECTORY
The magic file entries have been collected from various
sources, mainly USENET, and contributed by various
authors. Ian Darwin (address below) will collect additional
or corrected magic file entries. A consolidation
of magic file entries will be distributed periodically.
The order of entries in the magic file is significant.
Depending on what system you are using, the order that
they are put together may be incorrect. If your old file
command uses a magic file, keep the old magic file around
for comparison purposes (rename it to /etc/magic.orig).
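Why the order matters can be seen in a small sketch that assumes the
first matching entry wins. Both entries and their descriptions are
hypothetical, as are the names first_match, GENERAL, SPECIFIC and
sample.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, bytes, str]  # (offset, expected bytes, description)


def first_match(data: bytes, entries: List[Entry]) -> Optional[str]:
    """Return the description of the first entry that matches."""
    for offset, expected, description in entries:
        if data[offset:offset + len(expected)] == expected:
            return description
    return None


# Hypothetical entries: the second is a more specific form of the first.
GENERAL: Entry = (0, b"%!", "PostScript document text")
SPECIFIC: Entry = (0, b"%!PS-Adobe", "PostScript document text (conforming)")

sample = b"%!PS-Adobe-2.0\n"
```

With [SPECIFIC, GENERAL] the sample gets the more detailed description;
with [GENERAL, SPECIFIC] the general entry matches first and the
specific one is never reached.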
HISTORY
There has been a file command in every UNIX since at least
Research Version 6 (man page dated January, 1975). The
System V version introduced one significant change:
the external list of magic number types. This slowed the
program down slightly but made it a lot more flexible.
This program, based on the System V version, was written
by Ian Darwin without looking at anybody else's source
code.
John Gilmore revised the code extensively, making it better
than the first version. Geoff Collyer found several